(e.g., NCBI Genome) for the organism, and use the “-r” and “-g” options as follows (you may
need to decompress the GFF file first) (Figures 3.11 and 3.12):
quast.py \
-o ref_quast_Ecoli_ass \
-t 4 \
-r ecolref/GCF_000005845.2_ASM584v2_genomic.fna.gz \
-g ecolref/GCF_000005845.2_ASM584v2_genomic.gff \
abyss_ecoli_ass.fasta \
spades_ecoli_ass.fasta \
spades_hyb_ecoli_ass.fasta
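If the annotation was downloaded in gzip-compressed form, it can be decompressed before running QUAST. The following is a minimal sketch, assuming the compressed file carries the same name as the GFF used above with a ".gz" suffix; the helper name decompress_gff is hypothetical and is used here only for illustration.

```shell
# Decompress a gzipped GFF next to the original, keeping the .gz copy.
# Assumption: the input path ends in ".gz"; the output drops that suffix.
decompress_gff() {
    gz_file="$1"
    out_file="${gz_file%.gz}"
    # -c writes to standard output, so the compressed file is left untouched.
    gunzip -c "$gz_file" > "$out_file" && printf '%s\n' "$out_file"
}

# Assumed file name: the GFF path from the QUAST command plus ".gz".
gff_gz="ecolref/GCF_000005845.2_ASM584v2_genomic.gff.gz"
if [ -f "$gff_gz" ]; then
    decompress_gff "$gff_gz"
fi
```

The function prints the path of the decompressed file, which can then be passed to the “-g” option.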
3.3.2 Evolutionary Assessment for De Novo Genome Assembly
Rather than relying on contig or scaffold length statistics such as N50 and L50, the
evolutionary assessment of a genome assembly measures how complete the assembly is
relative to the evolutionarily informed expectation of gene content, which is inferred from
orthologous groups of sequences in closely related organisms. In other words, it assesses
the completeness of a genome assembly in terms of gene content.
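For contrast, the length-based statistics mentioned above can be computed from contig lengths alone, with no knowledge of gene content. The following is a minimal sketch, assuming one contig length per line on standard input; the function name n50_l50 is hypothetical, and the lengths in the usage line are made-up illustrative values.

```shell
# Compute N50 and L50 from contig lengths (one integer per line on stdin).
# N50: length of the contig at which the running total, taken from the
#      longest contig downward, first reaches half of the total length.
# L50: number of contigs needed to reach that point.
n50_l50() {
    sort -rn | awk '
        { len[NR] = $1; total += $1 }
        END {
            half = total / 2
            for (i = 1; i <= NR; i++) {
                running += len[i]
                if (running >= half) {
                    printf "N50=%d L50=%d\n", len[i], i
                    exit
                }
            }
        }'
}

# Illustrative (made-up) contig lengths.
printf '%s\n' 100 80 50 30 20 10 | n50_l50
```

Two assemblies can share the same N50 and L50 yet differ in how many expected genes they contain, which is the gap the evolutionary assessment is meant to fill.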
BUSCO (Benchmarking Universal Single-Copy Orthologs) [12, 13] is an evolution-based
quality assessment program that uses information about known genes from a database
FIGURE 3.11 QUAST assembly assessment report (reference-guided assessment).